EpiCompare compares different epigenetic datasets for benchmarking and quality control purposes. The report consists of three main sections:

  1. General Metrics: Metrics on fragments (duplication rate) and peaks (blacklisted peaks and widths)
  2. Peak Overlaps: Percentage and statistical significance of overlapping and unique peaks
  3. Functional Annotation: Functional annotation (ChromHMM, motif and enrichment analysis) of peaks

Input peak files. Total of 2 samples:

## [1] "File1: ENCODE"
## [1] "File2: CnT"


1. General Metrics


Fragment Information

This information is displayed only if summary metrics from Picard are provided. See the help manual.

  • Mapped_Fragments: Number of mapped read pairs in the file.
  • Duplication_Rate: Percentage of mapped sequence that is marked as duplicate.
  • Unique_Fragments: Number of mapped read pairs that are not marked as duplicates.
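
The relationship between the three fragment metrics can be illustrated with a minimal sketch. It assumes paired-end data only and takes two hypothetical inputs, `read_pairs_examined` and `read_pair_duplicates`. Picard's own summary also accounts for unpaired reads and optical duplicates, which are ignored here, so this is not necessarily how the report derives its values.

```python
def fragment_metrics(read_pairs_examined, read_pair_duplicates):
    """Return (mapped fragments, duplication rate in %, unique fragments).

    Simplifying assumption: paired-end data only, no unpaired reads.
    """
    if read_pairs_examined <= 0:
        raise ValueError("read_pairs_examined must be positive")
    # Share of mapped read pairs flagged as duplicates, as a percentage.
    duplication_rate = 100.0 * read_pair_duplicates / read_pairs_examined
    # Mapped read pairs that were not flagged as duplicates.
    unique_fragments = read_pairs_examined - read_pair_duplicates
    return read_pairs_examined, round(duplication_rate, 2), unique_fragments


mapped, dup_rate, unique = fragment_metrics(1_000_000, 125_000)
print(mapped, dup_rate, unique)  # 1000000 12.5 875000
```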


Peak Information

  • Total_N: Total number of peaks including those blacklisted.
  • Blacklisted_Peaks: Percentage of blacklisted peaks present in the sample.
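
As a rough illustration of the Blacklisted_Peaks metric, the sketch below counts the peaks that share at least one base with a blacklisted region and reports them as a percentage of Total_N. The tuple representation `(chrom, start, end)`, the half-open coordinates, and the function names are assumptions made for this example; the report itself works on genomic range objects.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def blacklisted_percentage(peaks, blacklist):
    """Percentage of peaks overlapping at least one blacklisted region."""
    if not peaks:
        return 0.0
    hits = sum(1 for p in peaks if any(overlaps(p, b) for b in blacklist))
    return round(100.0 * hits / len(peaks), 2)


# Toy data, not taken from the report.
peaks = [("chr1", 100, 200), ("chr1", 500, 600),
         ("chr2", 100, 200), ("chr2", 900, 950)]
blacklist = [("chr1", 150, 160), ("chr3", 0, 1000)]

print(len(peaks))                                # Total_N: 4
print(blacklisted_percentage(peaks, blacklist))  # 25.0
```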


Sample    Total_N    Blacklisted_Peaks (%)
ENCODE    1          1.28
CnT       1670       5.39

Peak widths

Distribution of peak widths in each sample after removing blacklisted peaks.



2. Peak Overlaps

Individual samples

Heatmap of percentage overlaps between input peak files. Hover over the heatmap for percentage values.
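
The values behind such a heatmap can be sketched as a pairwise matrix. In this minimal example, `result[a][b]` is the percentage of peaks in sample `a` that overlap at least one peak in sample `b`. The direction of the percentage (row versus column), the tuple representation, and the function name are assumptions for illustration, and the matrix is generally not symmetric.

```python
def percent_overlap_matrix(samples):
    """samples: dict mapping a sample name to a list of (chrom, start, end).

    Returns a nested dict where result[a][b] is the percentage of peaks
    in a that overlap at least one peak in b (half-open coordinates).
    """
    def hit(p, others):
        return any(p[0] == q[0] and p[1] < q[2] and q[1] < p[2]
                   for q in others)

    matrix = {}
    for a, peaks_a in samples.items():
        matrix[a] = {}
        for b, peaks_b in samples.items():
            n = sum(1 for p in peaks_a if hit(p, peaks_b))
            matrix[a][b] = round(100.0 * n / len(peaks_a), 1) if peaks_a else 0.0
    return matrix


# Toy peaks, not taken from the report.
samples = {
    "ENCODE": [("chr1", 100, 200), ("chr1", 400, 500),
               ("chr2", 50, 80), ("chr2", 300, 400)],
    "CnT": [("chr1", 150, 250), ("chr2", 350, 360)],
}
m = percent_overlap_matrix(samples)
print(m["ENCODE"]["CnT"])  # 50.0
print(m["CnT"]["ENCODE"])  # 100.0
```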

Individual samples + other histone modifications

Heatmap of percentage overlaps between input peak files and other histone marks from ENCODE. Hover over the heatmap for percentage values.

Significance of overlapping vs unique peaks

The plot is displayed only if a reference peak file is provided and stat_plot = TRUE. Depending on the format of the reference file, the function outputs different plots:

  • If the reference file is in BED6+4 format (peaks called with MACS2), the plot is a paired boxplot showing the distribution of \(-\log_{10}(q\text{-value})\) for overlapping and unique peaks per sample.
  • If the reference file does not have BED6+4 format, it generates a barplot of percentage overlap per sample, coloured by adjusted p-value.
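
The BED6+4 case can be sketched as follows. In the narrowPeak (BED6+4) layout a line has ten tab-separated columns, and the ninth holds \(-\log_{10}(q\text{-value})\). The example detects that layout, splits the peaks that carry q-values into overlapping and unique groups relative to another peak set, and summarises each group by its median. Which peak set is split, the function names, and the use of a median in place of a boxplot are assumptions made for this illustration.

```python
from statistics import median


def parse_peak(line):
    """Parse a tab-separated BED-like line.

    Returns (chrom, start, end, neg_log10_q); neg_log10_q is None
    unless the line has the ten columns of BED6+4 (narrowPeak).
    """
    f = line.rstrip("\n").split("\t")
    q = float(f[8]) if len(f) == 10 else None
    return f[0], int(f[1]), int(f[2]), q


def split_by_overlap(peaks, other):
    """Split peaks into those overlapping any interval in other and the rest."""
    overlapping, unique = [], []
    for p in peaks:
        hit = any(p[0] == o[0] and p[1] < o[2] and o[1] < p[2] for o in other)
        (overlapping if hit else unique).append(p)
    return overlapping, unique


# Toy narrowPeak lines, not taken from the report.
lines = [
    "chr1\t100\t200\tp1\t900\t.\t12.0\t20.0\t18.5\t50",
    "chr1\t400\t500\tp2\t300\t.\t4.0\t5.0\t3.0\t40",
    "chr2\t50\t80\tp3\t200\t.\t3.0\t4.0\t2.5\t10",
    "chr2\t300\t400\tp4\t950\t.\t15.0\t28.0\t25.0\t60",
]
with_q = [parse_peak(line) for line in lines]
other = [("chr1", 150, 250), ("chr2", 350, 360)]

overlapping, unique = split_by_overlap(with_q, other)
print(median(p[3] for p in overlapping))  # 21.75
print(median(p[3] for p in unique))       # 2.75
```

When the q-value column is absent, `parse_peak` returns `None` in the fourth position, which corresponds to the case where only the percentage overlap per sample can be reported.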